Artificial Neural Network: A New Approach for QSAR Study
Parimal M. Prajapati1, Yatri R. Shah2 and Dhrubo Jyoti Sen3
1I. K. Patel College of Pharmaceutical Education and Research, Samarth Campus, Opp. Sabar Dairy, Himmatnagar-383001, Sabarkantha, Gujarat
2Shree H. N. Shukla Institute of Pharmaceutical Education and Research, Behind Marketing Yard, Nr. Lalpari Lake, Amargadh (Bhichari), Rajkot, Gujarat
3Department of Pharmaceutical Chemistry, Shri Sarvajanik Pharmacy College, Hemchandracharya North Gujarat University, Arvind Baug, Mehsana-384001, Gujarat, India
INTRODUCTION:
An
artificial neural network (ANN), usually called "neural network"
(NN), is a mathematical model or computational model that tries to simulate the
structure and/or functional aspects of biological neural networks. It consists of
an interconnected group of artificial
neurons and processes information using a connectionist
approach to computation. In most cases an ANN is an adaptive
system that changes its structure based on external or internal
information that flows through the network during the learning phase. Neural
networks are non-linear statistical data modeling
tools. They can be used to model complex relationships between inputs and outputs
or to find patterns in data1.
The role of the medicinal chemist has remained essentially unchanged for the
past 50 years: a quest for rapid and efficient methods that optimize biological
activity through structural variation. This need is historically driven by the
fact that, on average, approximately 10,000 compounds are prepared and
evaluated for every one that becomes a marketable drug. The current cost of
drug development is nearly $600 million, and increased development time has
shortened the useful patent life in which companies can recover their costs;
one-third of these costs are estimated to occur in the lead generation,
discovery, and optimization phase.
The dramatic increase in the cost of discovery resources is highlighted by the
fact that a single, traditionally synthesized compound is estimated to cost
$6,000 per research-size sample2. These factors have contributed to a paradigm
shift in the way pharmaceutical research is conducted; companies are adopting
approaches that reduce costs early in the drug development process.
Figure-1:
Artificial Neural Network
Combinatorial chemistry and high-throughput screening
(HTS) are exciting techniques that are being adopted by the pharmaceutical and
agrochemical industries in an effort to reduce costs and shorten discovery and
optimization time. Computational scientists are contributing to this effort
through combinatorial chemistry library analysis, diversity analysis, and
quantitative structure activity relationship (QSAR) studies. QSAR studies rely
heavily upon statistics to derive mathematical models which relate the
biological activity of a series of compounds to one or more properties of the
molecules. These properties, or descriptors, may be derived from numerous
sources, including refractive index, octanol/water partition coefficient, or
spectral data. In cases where experimental values for these properties are not
available, several programs can compute them: the popular CLOGP program3 can be
used for the computation of octanol/water partition coefficients; alternatively,
theoretical properties may be obtained from computational programs such as
MOPAC4. A plethora of graph theory-based topological descriptors are available
from programs such as MOLCONN-X5. Extensive lists of substituent parameters
describing electronic (sigma), lipophilic (pi), and steric (CMR and Taft
coefficients) properties are also available. The initial phase of a QSAR study
requires the collection of many of these descriptors prior to model building6.
Figure-2: Neural
Network
The seminal work in the field of QSAR was reported by Hansch et al., who
demonstrated the use of regression analysis for model building7. In the years
since Hansch introduced regression analysis to chemistry, other methods have
been developed and explored to circumvent some of the problems associated with
this technique.
The success of regression analysis in QSAR model
building depends upon an assumed linear relationship between the biological
activity and one or more descriptors. As the number of descriptors increases,
however, regression analysis becomes problematic. One problem likely to occur
in large descriptor sets, for example, is redundancy in information when
descriptors are correlated. Latent variable techniques have become accepted
methods of addressing this issue8. These techniques include
the use of principal components in regression analysis and the method of
partial least squares. A second problem encountered in using regression
analysis is the a priori assumption of a model form (i.e. quadratic,
cubic, use of cross terms, etc.). In order to address this issue, variable
selection techniques such as stepwise forward and stepwise backward multiple
linear regression analysis (MLR) were introduced. One recurrent problem in all
of these methods is that, because computational methods are used to generate
descriptors, a modern dataset may contain more descriptors than compounds, that
is, more columns of parameters than rows of compounds. This results in the
insidious problem described by Topliss and Edwards: the observed correlations
may be chance correlations9.
Figure-3: QSAR study
Intriguing approaches using machine learning methods
have been under study in the field of chemistry for the past decade. The first
description of a simple neural network was provided in 194310. Interest in
neural networks grew slowly until the 1980s, when new computer architectures
and learning algorithms began to appear. The use of artificial neural networks
(ANN) in all fields has since grown substantially. In 1988, Hoskins et al.
reported the first use of ANN for process control in chemistry11. This was
followed by two
reports using ANN for prediction of protein secondary structure. The use of ANN
in chemistry has further expanded into the analysis of spectral data,
pharmaceutical product development, classification of anticancer compounds,
prediction of chemical reactivity, physical properties, electrostatic
potential, ionization potentials as well as QSARs12.
Neural networks are part of a new era of evolving
computer technology in which a computer system has been designed to learn from
data in a manner emulating the learning pattern in the brain. Neural networks
are typically used when there are a large number of observations and when the
problem is not understood well enough to write a procedural program or expert
system. Using neural networks, the solution to the problem is sought as
follows:
1. An answer is calculated by multiplying each input by the connection weight;
2. Products are summed at each hidden unit, where a non-linear transfer function is applied; and
3. The output of each hidden unit is then multiplied by the connection weight, summed, and interpreted.
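As a concrete illustration, the three steps can be sketched in a few lines of Python; the two-input, three-hidden-unit shape echoes Figure 4, and all weight values here are invented for illustration.

```python
import math

def forward(inputs, hidden_weights, output_weights):
    """Steps 1-3: weight the inputs, sum and transform at each hidden
    unit (tanh as the non-linear transfer function), then weight and
    sum the hidden outputs to produce the network's answer."""
    hidden_out = []
    for weights in hidden_weights:                           # one row per hidden unit
        total = sum(x * w for x, w in zip(inputs, weights))  # steps 1-2: weighted sum
        hidden_out.append(math.tanh(total))                  # step 2: non-linear transfer
    # step 3: multiply each hidden output by its connection weight, sum, interpret
    return sum(h * w for h, w in zip(hidden_out, output_weights))

# 2 inputs and 3 hidden units, as in Figure 4 (weights are arbitrary)
y = forward([0.2, 0.7],
            [[0.1, -0.4], [0.3, 0.2], [-0.5, 0.6]],
            [0.8, -0.2, 0.5])
```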
The neural network "learns" by repeatedly
passing through the data and adjusting its connection weights to minimize the
error; in this case, the predicted, versus the actual biological activity. A
neural network is thus a mathematical model to describe a non-linear hyper
surface. The increasing interest and availability of neural network software
has prompted several groups to apply this technology in QSAR studies. Neural
networks have been applied as a substitute for discriminant analysis and have
been used in QSAR in a manner similar to multiple regression analysis. A
comparative study of neural networks and regression analysis using a set of
dihydrofolate reductase inhibitors indicated that neural networks were superior
to regression analysis in providing accurate predictions, but that the design
of the neural net was critical to obtaining these results13.
First, the design of the network is critical with respect to the number of
hidden units involved. The network will overfit, or memorize, the data if too
many hidden units are used. Conversely, the network will fail to generalize and
become unstable if too few hidden units are used. The second factor which must
be considered is the length of the training time. It is possible that networks
may be overstrained, and thus destabilized, through the use of excessive
training periods. Third, the selection of an appropriate test set and training
set is important. The training set should adequately represent the entire
dataset and be sufficiently large in order to properly train the neural
network. The test set should fall within the domain covered by the neural
network model; in addition, it should be large enough to provide for an
assessment of the model. Finally, the results obtained from neural networks
can be difficult to interpret and apply to the drug design problem. This issue
is especially troublesome for the medicinal chemist who is not an expert in the
use and interpretation of neural network technology.
We have reported our initial results in the use of
neural networks to identify the descriptors most relevant to biological
activity. The present report describes our continued work in this area and the
enhancements we have made in the methodology. Our objective is to provide
additional tools for the medicinal chemist in the area of molecular design. Our
focus is to apply neural network technology early in the development of the SAR
in a manner that, for the medicinal chemist, is easy to use. A further goal is
to provide a technique whose results are both relevant and interpretable.
Methodology has been developed and incorporated within a program, named AUTONET,
that represents a self-training neural network. Results from the neural network
are presented visually in order to rapidly and easily convey to the medicinal
chemist the important features derived by the neural network14.
Figure-4: Sample neural network showing connections for 2 inputs and 3 hidden units
METHODS:
An artificial neural network consists of a number of
"neurons" or "hidden units" that receive data from the
outside, process the data, and output a signal. A "neuron" is
essentially a regression equation with a non-linear output. When more than one
of these neurons is used, non-linear models can be fitted. These networks have
been shown to work well for modeling a number of different problems, including
QSAR. Neural networks are known for their ability to model a wide set of functions
without knowing the model a priori. The back-propagation network
receives a set of inputs which are multiplied by each neuron's weights (Figure
4). These products are summed for each neuron and a non-linear transfer
function is applied. The bias has the effect of shifting the transfer function
to the left or right. The transformed sums are then multiplied by the output
weights where they are summed a final time, transformed, and interpreted. Since
a back-propagation network is a supervised method, the desired output must be
known for each input vector so an error (the difference between the desired
output and the network's predicted output) can be calculated. This error is
propagated backwards through the network (thus the name), adjusting the weights
so that the next time the network sees the same input pattern, it will come
closer to the desired output. The patterns are shown many times until the
network either learns the relation or determines that there is none15.
METHODOLOGY: For our purposes, the input vector and output values
were normalized between 0.1 and 0.9 by column. This ensures that no
exceptionally large valued descriptors will have an undue effect on the
network. All of the connection weights are initialized to very small random
numbers (+/- 0.0005). This is necessary so each hidden unit will respond to a
slightly different feature in the input vector. Each hidden unit outputs the
hyperbolic tangent of the sum of the products of the inputs and the weights
(Equation 1):

hj = tanh(Σi wij xi)    (1)

The hyperbolic tangent function has a range of -1 to 1, with the highest gain
near 0. This compresses the output of the unit, which defines a maximum
contribution for each hidden unit. The output unit takes the sum of the
products of the hidden units and the weights (Equation 2):

x = Σj vj hj    (2)

and applies the transfer function (Equation 3):

out = 1 / (1 + e^-x)    (3)

where x is the value of the output unit. This function has minimum and maximum
values of 0 and 1, respectively. Since the output values were normalized
between 0.1 and 0.9, this allows the network to slightly exceed the minimum and
maximum values that were given in the original data file16.
Once the output is calculated, it is compared to the desired output value for
that particular vector (the biological activity). An error, ε, is calculated
according to Equation 4 and is used in a gradient descent algorithm to adjust
the weights of the network (Equation 5):

ε = experimental - predicted    (4)

Δvj = η ε out(1 - out) hj    (5)

where η is the learning rate, which controls the step size of the gradient
descent algorithm. The learning rate is typically between 0 and 1 and is
decreased during training as the solution is reached. The term out(1 - out) is
the derivative of the transfer function (Equation 3). The hidden unit weights
are adjusted in a similar manner (Equation 6):

Δwij = η ε out(1 - out) vj (1 + hj)(1 - hj) xi    (6)

The term (1 + hj)(1 - hj) is the derivative of the hyperbolic tangent transfer
function.
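A minimal Python sketch of one training update following Equations 1-6 is given below. The network size, input vector, target, and learning rate are illustrative assumptions, not values taken from AUTONET; the ±0.0005 weight initialization follows the description above.

```python
import math
import random

def sigmoid(x):
    # Equation 3: logistic transfer function with range (0, 1)
    return 1.0 / (1.0 + math.exp(-x))

def train_step(x, target, W, v, lr=0.6):
    """One gradient-descent update over Equations 1-6.
    W holds the hidden-unit weights (one row per unit), v the output weights."""
    # Equation 1: tanh of the weighted input sum, per hidden unit
    h = [math.tanh(sum(wi * xi for wi, xi in zip(row, x))) for row in W]
    # Equations 2-3: weighted sum of hidden outputs through the transfer function
    out = sigmoid(sum(vj * hj for vj, hj in zip(v, h)))
    err = target - out                                   # Equation 4
    v_old = list(v)
    for j in range(len(v)):                              # Equation 5
        v[j] += lr * err * out * (1 - out) * h[j]
    for j, row in enumerate(W):                          # Equation 6
        for i, xi in enumerate(x):
            row[i] += lr * err * out * (1 - out) * v_old[j] * (1 + h[j]) * (1 - h[j]) * xi
    return abs(err)

# weights initialized to very small random numbers, as described above
random.seed(0)
W = [[random.uniform(-0.0005, 0.0005) for _ in range(2)] for _ in range(3)]
v = [random.uniform(-0.0005, 0.0005) for _ in range(3)]
errors = [train_step([0.3, 0.7], 0.9, W, v) for _ in range(200)]
```

With such tiny initial weights the error shrinks only slowly at first, which is exactly why the article trains for many epochs.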
Each input vector and desired output pair for the
entire training set was presented to the network and the weights were adjusted.
The training set was generated by sorting all the data based on the output,
biological activity, and then every fourth compound was placed in a testing set
and the remaining compounds were used for the training set. Sorting the small
datasets that are typical of QSAR studies ensured that the test set was as
representative as possible. One complete cycle through the data is called an
epoch. During each epoch, the order in which the compounds were presented was
randomized. This procedure improved the overall performance of the neural network.
The training and testing errors were calculated (Equation 7) for the testing
set every 10 epochs and this value was saved with the network weights.
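This bookkeeping around the training loop (testing error checked every 10 epochs, best weights saved, training rolled back when improvement stalls for 250 epochs) can be sketched as follows; `train_one_epoch` and `testing_error` are hypothetical stand-ins for routines the article does not list.

```python
import copy

def train_with_rollback(weights, train_one_epoch, testing_error,
                        check_every=10, patience=250, max_epochs=5000):
    """Check the testing error every `check_every` epochs, saving the
    weights at each new best; stop and return the best saved weights
    once no improvement has been seen for `patience` epochs."""
    best_err = float("inf")
    best_weights = copy.deepcopy(weights)
    last_improvement = 0
    for epoch in range(1, max_epochs + 1):
        train_one_epoch(weights)
        if epoch % check_every == 0:
            err = testing_error(weights)
            if err < best_err:
                best_err = err
                best_weights = copy.deepcopy(weights)
                last_improvement = epoch
            elif epoch - last_improvement >= patience:
                break                    # return to the best saved weights
    return best_weights, best_err

# toy stand-ins: the "error" improves until epoch 100, then worsens
state = [0.0]
def step(w): w[0] += 1
def error(w): return abs(w[0] - 100)
best_w, best_e = train_with_rollback(state, step, error)
```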
If the testing error had not decreased in 250 epochs,
the network was returned to that set of weights. Often this technique will
produce a large number of networks of different sizes that have similar
training and testing errors. The best network is the one with the smallest
testing error. The r2 value is also checked since it is possible to
have a small error and a poor r2 value. If there are several
networks of similar errors, the smallest network is often the easiest to
interpret. The model knowledge in the neural network can be discerned by
examining the weights. As the weight from an input descriptor to a hidden unit
approaches zero, then the effect that the descriptor can have on the model
approaches zero. However, it was not obvious which descriptors were
contributing to the model and which only had chance effects since every
descriptor had a weight coefficient. In order to establish cutoff criteria, three
random descriptors were included as input vectors. The networks were trained
with these random descriptors along with the other descriptors. Now it was
possible to compare the various chemical descriptors against the random
descriptors to determine which descriptors were more significant than random
noise. A second set of networks was trained with a reduced set of descriptors.
Only those descriptors whose absolute value of the weight coefficient was
larger than the largest of the random descriptors were used. In addition the
descriptors from the best (lowest testing error) network were automatically
included. Often, the networks using this "reduced" set of descriptors
will outperform the original set. This increases the likelihood that a
difficult model can be solved and also that an easy-to-interpret network will
be constructed. Non-linear effects are also determined. An examination of the
weights for a descriptor where the largest weight and the second largest are of
opposite sign and are at least half the magnitude of the largest weight in the
network suggests the presence of a non-linear effect. This tends to identify
non-linear effects with fewer compounds than otherwise possible17.
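The cutoff procedure can be sketched as follows: a descriptor survives only if its largest absolute hidden-unit weight exceeds the largest absolute weight found on any of the random-noise descriptors. The descriptor names and weight values here are invented for illustration.

```python
def significant_descriptors(weights, random_names):
    """weights maps each descriptor name to its hidden-unit weight
    coefficients; a descriptor is kept only if its largest |weight|
    beats every weight on the known random-noise descriptors."""
    cutoff = max(abs(w) for name in random_names for w in weights[name])
    return [name for name, ws in weights.items()
            if name not in random_names
            and max(abs(w) for w in ws) > cutoff]

# hypothetical weight coefficients for two real and two random descriptors
weights = {
    "CLOGP": [0.9, 0.7, 0.8],
    "ATCH4": [0.6, -0.5, 0.4],
    "RAND1": [0.1, -0.2, 0.05],
    "RAND2": [-0.3, 0.15, 0.1],
}
kept = significant_descriptors(weights, ["RAND1", "RAND2"])
```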
In order to present the chemist with useful information
from the neural network, certain data are visualized. The hidden unit weights
for each descriptor for each network are displayed in a color map. A green
color indicates that a weight value is near zero, blue is a negative weight,
and red is positive. If an output weight is negative, all of the weights
entering that particular hidden unit are multiplied by (-1). This can be done
because the hyperbolic tangent is an odd function, so negating both a hidden
unit's incoming weights and its output weight leaves the model unchanged. If a
descriptor has all red weights, then increasing the value of that descriptor
will have a positive effect on the output of the network. The chemist thus can
quickly see which descriptors are consistently having an effect on the
different models. The numerical value of the hidden unit weights is also
available. The chemist can also examine the weights of the best network by
testing error, or if there are several that are close, the smallest hidden unit
size. Each network is given a brightness which is proportional to its testing
error. Random descriptors are given colors as well. The chemist may opt to use
these descriptors in a multiple regression study.
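The color mapping can be sketched as follows: weights near zero map to green, positive weights shade toward red, negative toward blue, and a negative output weight flips the signs of that unit's incoming weights first. The linear shading scheme is an illustrative assumption.

```python
def weight_color(w, max_abs):
    """Map a weight to an (r, g, b) triple: green near zero, shading to
    red for positive weights and blue for negative ones."""
    t = min(abs(w) / max_abs, 1.0) if max_abs else 0.0
    if w >= 0:
        return (t, 1.0 - t, 0.0)         # green -> red
    return (0.0, 1.0 - t, t)             # green -> blue

def color_map(hidden_weights, output_weights):
    """Color the weights entering each hidden unit. If a unit's output
    weight is negative, its incoming weights are multiplied by -1 first,
    which leaves the tanh-based model unchanged."""
    rows = []
    for unit, v in zip(hidden_weights, output_weights):
        signed = [w if v >= 0 else -w for w in unit]
        scale = max(abs(w) for w in signed)
        rows.append([weight_color(w, scale) for w in signed])
    return rows

# two hidden units, two descriptors; the second unit's output weight is negative
cells = color_map([[0.8, -0.8], [0.5, 0.5]], [1.0, -1.0])
```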
All networks were of the back-propagation type and
trained on a Silicon Graphics Workstation. The program AUTONET is written in
the language C. The networks were trained to predict activity. A hyperbolic
tangent transfer function was used with a user definable learning rate
coefficient between 0.1 and 1.0. All inputs were normalized between 0.1 and
0.9. Multiple networks, at least 3, were developed for each level of hidden
units specified. The number of hidden units was predetermined at discrete
levels of 1, 3, 5, 7, 9, 11, and 21, depending on the number of compounds in the
dataset. Presentation of inputs to the ANN was randomized after every epoch.
Each network was also started from randomized initial weights in order to start
each one at a different point on the response surface. Results from all of the
networks were compared. The total number of networks built is two times the
product of the number of passes and the number of hidden unit levels. This is
the combined total of networks constructed with the complete set of descriptors
and the number of networks built with the reduced set of descriptors. For example,
30 networks are built for each training/test set with 3 passes and 5 hidden
units18.
In a typical neural network application, the dataset is
randomly divided into two subsets. One group, the larger of the two, is used to
train the network while the smaller subset is used to evaluate the predictive
power of the network. QSAR datasets are typically small in the early stages of
the project and thus it becomes impractical to reduce them substantially. In
the present study, the datasets were sorted by activity and exemplars were
removed for testing purposes by one of two methods. The first method requires
the removal of compounds from the sorted dataset at predetermined intervals.
The second method, used for smaller datasets, is a leave-one-out procedure
requiring the removal of each compound, one at a time, to serve as the test
case. In each case the observation(s) that was removed served as the test case
for the network. If the datasets were not sorted before these techniques were
applied, the learning power of the resulting neural network was compromised.
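The two test-set constructions can be sketched as follows; the compound labels and activity values are invented for illustration.

```python
def every_fourth_split(compounds, activities):
    """Method one: sort by activity, then place every fourth record in
    the test set and the remainder in the training set."""
    ranked = [c for _, c in sorted(zip(activities, compounds),
                                   key=lambda pair: pair[0])]
    test = ranked[3::4]                              # records 4, 8, 12, ...
    train = [c for i, c in enumerate(ranked) if (i + 1) % 4 != 0]
    return train, test

def leave_one_out(compounds):
    """Method two, for smaller datasets: each compound serves once as
    the test case while the rest form the training set."""
    for i, held_out in enumerate(compounds):
        yield [c for j, c in enumerate(compounds) if j != i], [held_out]

train, test = every_fourth_split(list("ABCDEFGH"), [8, 7, 6, 5, 4, 3, 2, 1])
splits = list(leave_one_out(["x", "y", "z"]))
```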
In order to determine if our method would allow the
neural networks to merely memorize data, even random noise, we conducted the
following experiment. Datasets were created containing 20 "compounds"
and 10, 20, 40, or 80 descriptors of random numbers. The output representing
biological activity was a random number. A network with 13 descriptors (10
descriptors plus 3 random noise descriptors) and 1 hidden unit has 14 adjustable
parameters. The same dataset with 21 hidden units has 274 adjustable parameters,
and thus a definite potential to overfit the data exists. With the current
methodology using the leave-one-out protocol, all the networks that were created
consistently memorized the data, as evidenced by three characteristics: 1) a low
training error on the learning set and a high testing error on the test set; 2)
generally (>75%) fewer than 100 epochs in each network; and 3) hidden weight
coefficients which, when examined in the color maps, were similar to those of
known random noise descriptors.
Dataset 1 (Selwood Dataset): A critical step prior to the construction of a neural
network is the selection of the appropriate size of the training set and the
test set. Care must be taken to ensure that each set is representative of the
other. The training set should be as large as possible in order to provide the
neural network with the best opportunity to learn. The test set should be
sufficiently large to provide new cases in order to fairly evaluate the neural
network. The Selwood dataset was sorted according to
the output response and every fourth record was removed. This created a test
set of 31/4 = 7 records (records 4, 8, 12, 16, 20, 24, 28) and a training set
of 24 records. Training sets were randomized before the first epoch and before
each subsequent epoch. Three descriptors of random numbers were added, the
learning rate was set at 0.6, and a total of 15 networks were generated with
all of the descriptors. Three networks were generated for each hidden unit
level of 1, 3, 5, 7, and 9. The results are depicted using 3 colors
(positive weight coefficients in red and negative weight coefficients in blue).
The left side of the panel represents the networks that include all of the
descriptors (one descriptor per row and one network per column). Each network
is given a brightness which is proportional to its testing error and thus the
darker appearing columns are those networks with the largest testing error. A
few networks failed to find any descriptor more important than another; as
evidenced by a green vertical bar. However, the remaining networks had low
testing errors and found one or more descriptors to be consistently important
to the learning behavior. The right side of the panel represents 15 new
networks using only those descriptors that frequently provided weights greater
than any of the three random number descriptors. The results indicate that the
learning behavior of the ANNs was most often related to several descriptors.
The descriptors ATCH4, ESDL3 and CLOGP were identified as important to the
learning of the ANN and had positive weight coefficients while ATCH6, DIPV_X,
DIPV_Z, NSDL1, and NSDL7 were important but had negative weight coefficients19.
Regardless of the learning behavior of the individual
neural network, useful information is readily conveyed by the color panels.
Notably the display illustrates that not all network configurations behave
similarly. Some networks failed to learn; as evidenced by the vertical green
columns. The networks with the smaller number of hidden units, 1 and 3 located
at the left edge of the panel, appear to have the best learning results. The r2
training and r2 testing values in the output file suggest that these networks
were predictive, although this is not required of the AUTONET-derived neural
network, nor is it the focus of our interest. Due to our deliberate
undertraining of the neural network, it is probable that this method found
local minima which may be responsible for the results. Since each network starts from
a different point on the response surface, these minima may or may not be the
same. The actual predictive power of ANN derived from local minima may be
suspect. The critical observation is that the solutions to these local minima
are derived from a similar set of descriptors. The interpretation of the
learning behavior of the network is possible due to the commonality of the
solutions (e.g., the important descriptors for learning) to these local minima
from the entire collection of multiple networks20.
As reported previously, multiple regression analysis may then be applied to the
dataset using the most important descriptors identified by the neural network.
The Selwood dataset has been the
subject of analysis by numerous approaches. These studies illustrate a general
point in model building that many models exist to explain a dataset. The
purpose of the exercise is to find reasonable models upon which to base
additional experiments.
Dataset 2 (Dunn Dataset): This was a small dataset of only 13 compounds and 5
descriptors as originally reported. We added 58 descriptors in order to better
represent a realistic situation in which no descriptor bias is assumed. Since
this dataset was small, we used a true cross-validation technique to train the
neural networks, in which every compound is removed once to serve as the test
case. For each network 3 descriptors of random data were added and the
learning rate set equal to 0.6. The color scheme for these plates is the same
as described for Dataset 1. The results are more complex than the Selwood results as more networks are produced. The results
following the removal of one compound to serve as the test case are depicted in
a set of 15 columns representing the 15 networks (3 passes x 5 sets of hidden
units). The results in the panel are those from a total of 195 networks. The
interpretation is further complicated by the fact that the first and last
columns are black indicating the network was not able to train properly when
the observations at either extreme are removed to serve as the test cases. This
behavior is indicative of a model from which extrapolations are not possible.
Similarly, darker columns within the body of the color plate indicate those
compounds whose removal produced less satisfactory network training. However
useful information becomes apparent from the color plates. The sigma descriptor
and its component Swain-Lupton R descriptor appear important and have a negative
weight coefficient. Thus one may conclude that the electronic effects of the
substituent may be important to the biological activity of these compounds.
This is in agreement with the results from regression analysis reported in
previous studies21.
Dataset 3 (Howbert Dataset): The largest dataset studied contained 47 compounds.
Unlike the previous examples, the dataset represents a classification problem
(active/inactive) based upon the in vivo biological potency. Using the same
methodology as described, the neural networks were trained to identify features
for the correct classification of the compound. Active compounds were
designated with the number 1 and inactive compounds were designated with the
number 0 according to the definition provided by Howbert et al.22 Three neural
networks were trained at each of the hidden unit levels (1, 3, 5, 7, 9, 11, 21)
following the addition of 3 descriptors of random numbers, with a learning rate
equal to 0.6.
It is apparent from the results that the neural networks strongly identified
VDWVOL, with a negative weight coefficient, and pi, with a positive weight
coefficient, as descriptors important in the correct classification of the
compounds into active or inactive groups. Both of these descriptors were
identified by Howbert et al. using cluster
significance analysis of their data. Multiple regression analysis did not yield
a statistically valid model. Examination of the training and testing errors
indicates, as expected, that the neural network was not predictive23.
The applications of neural networks have typically required large datasets and
extensive training periods in order to achieve a predictive solution. We have
shown, using the techniques in the neural network program AUTONET, that it is
possible to extract information using a neural network from relatively small
datasets and from networks that are not statistically predictive. The AUTONET
method uses a series of multiple, short-training neural networks to provide
local minima as solutions. The information content is extracted from the
coefficients of the hidden weights associated with the input descriptors, with
the overall solution provided by a consensus of solutions to the local minima.
We found that, in spite of short training periods, the neural network memorized
random data very quickly. However, a characteristic profile of an overstrained
network was identified: 1) a low training error on the learning set and a high
testing error on the test set; 2) generally (>75%) fewer than 100 epochs in
each network; and 3) hidden weight coefficients similar to each other and
similar to the coefficients of known random noise descriptors. The addition of
three descriptors of random numbers allows for the establishment of a level
from which to judge background noise and chance correlation. The focus of the
AUTONET approach is not to achieve a
predictive solution. We have attempted to broaden the scope of the utility of
neural networks in QSAR by gaining information in the absence of a predictive
solution. An often encountered difficulty with neural networks is their lack of
interpretation. The AUTONET approach addresses this through the visual display
of the hidden unit weights and thus rapidly conveys useful and informative
results to the user.
REFERENCES:
1. Cangelosi, Angelo; Parisi, Domenico (1998), Emergence of language in an evolving population of neural networks. Connection Science, 10(2), 83-97.
2. Huston, S. (1995), Integrated Strategies in Drug Discovery, 23, 19-21.
3. Ferrer i Cancho, Ramon; Solé, Ricard V. (2002), Zipf's law and random texts. Advances in Complex Systems, 5(1), 1-6.
4. Miller, George A. (1957), Some effects of intermittent silence. American Journal of Psychology, 70, 311-314.
5. Kirby, Simon (2001), Spontaneous evolution of linguistic structure: an iterated learning model of the emergence of regularity and irregularity. IEEE Transactions on Evolutionary Computation, 5(2), 102-110.
6. Boyd, D.; Seward, C. M. QSAR: Rational Approaches to the Design of Bioactive Compounds. Silipo, C.; Vittoria, A., Eds.; Elsevier Science Publishers B. V.: Amsterdam, 1991; 167-170.
7. Hansch, C.; Muir, R. M.; Fujita, T.; Maloney, P. P.; Geiger, F.; Streich, M. (1963), The Correlation of Biological Activity of Plant Growth Regulators and Chloromycetin Derivatives with Hammett Constants and Partition Coefficients. J. Am. Chem. Soc., 85, 2817-2824.
8. Dunn, W. J.; Wold, S.; Edlund, U.; Hellberg, S.; Gasteiger, J. (1984), Quant. Struct.-Act. Relat., 3, 131-137.
9. Topliss, J. G.; Edwards, R. P. (1979), Chance Factors in Studies of Quantitative Structure-Activity Relationships. J. Med. Chem., 22, 1238-1244.
10. McCulloch, W. S.; Pitts, W. (1943), A Logical Calculus of the Ideas Immanent in Nervous Activity. Bull. Math. Bio., 5, 115-133.
11. Hoskins, J. C.; Himmelblau, D. M. (1988), Artificial Neural Network Models of Knowledge Representation in Chemical Engineering. Comput. Chem. Eng., 12, 881-890.
12. (a) Qian, N.; Sejnowski, T. J. (1988), Predicting the Secondary Structure of Globular Proteins Using Neural Network Models. J. Mol. Biol., 202, 865-884. (b) Bohr, H.; Bohr, J.; Brunak, S.; Cotterill, R.; Lautrup, B.; Norskov, L.; Olsen, O.; Petersen, S. (1988), Protein Secondary Structure and Homology by Neural Networks. FEBS Lett., 241, 223-228.
13. T. Rives, S. S. (1994), Prediction of Atomic Ionization Potentials I-III Using an Artificial Neural Network. J. Chem. Inf. Comput. Sci., 34, 617-620.
14. Rumelhart, D. B. Parallel Distributed Processing. Feldman, J. A.; Hayes, P. J.; Rumelhart, D. B., Eds.; The MIT Press: London, 1982; 1, 318-363.
15. Aoyama, T.; Suzuki, Y.; Ichikawa, H. (1990), Neural Networks Applied to Structure-Activity Relationships. J. Med. Chem., 33, 905-908.
16. T. A.; Kalayeh, H. (1991), Application of Neural Networks. J. Med. Chem., 34, 2824-2836.
17. Richards, W. G. (1992), Application of Neural Networks: Quantitative Structure-Activity Relationships of the Derivatives of 2,4-Diamino-5-(substituted-benzyl)pyrimidines as DHFR Inhibitors. J. Med. Chem., 35, 3201-3207.
18. Manallack, D. T.; Ellis, D. D.; Livingstone, D. J. (1994), Analysis of Linear and Nonlinear QSAR Data Using Neural Networks. J. Med. Chem., 34, 3758-3767.
19. Gakh, A. A.; Gakh, E. R.; Sumpter, B. G.; Nord, D. W. (1994), Neural Network-Graph Theory Approach to the Prediction of the Physical Properties of Organic Compounds. J. Chem. Inf. Comput. Sci., 34, 832-839.
20. Wikel, J. H.; Dow, E. R. (1993), The Use of Neural Networks for Variable Selection in QSAR. Bioorg. Med. Chem. Lett., 3, 645-651.
21. Manallack, D. T.; Livingstone, D. J. (1995), Neural Networks and Expert Systems in Molecular Design. Methods Princ. Med. Chem., 3 (Advanced Computer-Assisted Techniques in Drug Discovery), 293-318.
22. Selwood, D. L.; Livingstone, D. J.; Comley, J. C.; O'Dowd, A. B.; Hudson, A. T.; Jackson, P.; Jandu, K. S.; Rose, V. S.; Stables, J. N. (1990), Structure-Activity Relationships of Antifilarial Antimycin Analogues: A Multivariate Pattern Recognition Study. J. Med. Chem., 33, 136-142.
23. Dunn, W. J.; Greenberg, M. J.; Callejas, S. S. (1976), Use of Cluster Analysis in the Development of Structure-Activity Relations for Antitumor Triazenes. J. Med. Chem., 19, 1299-1301.
Received on 26.11.2010; Accepted on 20.12.2010
© A&V Publication. All rights reserved.
Research J. Science and Tech. 3(1): Jan.-Feb. 2011: 17-24